38 research outputs found

Structuring Wikipedia Articles with Section Recommendations

Author: Catasta Michele
Piccardi Tiziano
West Robert
Zia Leila
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/05/2018

Sections are the building blocks of Wikipedia articles. They enhance readability and can be used as a structured entry point for creating and expanding articles. Structuring a new or already existing Wikipedia article with sections is a hard task for humans, especially for newcomers or less experienced editors, as it requires significant knowledge about how a well-written article looks for each possible topic. Inspired by this need, the present paper defines the problem of section recommendation for Wikipedia articles and proposes several approaches for tackling it. Our systems can help editors by recommending what sections to add to already existing or newly created Wikipedia articles. Our basic paradigm is to generate recommendations by sourcing sections from articles that are similar to the input article. We explore several ways of defining similarity for this purpose (based on topic modeling, collaborative filtering, and Wikipedia's category system). We use both automatic and human evaluation approaches for assessing the performance of our recommendation system, concluding that the category-based approach works best, achieving precision@10 of about 80% in the human evaluation.

Comment: SIGIR '18 camera-ready
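
The precision@10 figure quoted in the abstract above measures what fraction of the top ten recommendations are judged relevant. A minimal sketch of that metric follows; the function name, the section titles, and the relevance judgments are illustrative assumptions and are not taken from the paper.

```python
def precision_at_k(recommended, relevant, k=10):
    """Return the fraction of the top-k recommended items that are relevant.

    `recommended` is a ranked list (best first); `relevant` is any iterable
    of items judged relevant. Dividing by k (not by the list length) follows
    the usual convention, so a short list is penalized.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    relevant_set = set(relevant)
    hits = sum(1 for item in recommended[:k] if item in relevant_set)
    return hits / k


# Hypothetical ranked section recommendations for one article.
ranked_sections = [
    "Early life", "Career", "Awards", "Personal life", "Discography",
    "Filmography", "Legacy", "References", "Geography", "Climate",
]
# Hypothetical human judgments: eight of the ten were deemed relevant.
judged_relevant = set(ranked_sections[:8])

score = precision_at_k(ranked_sections, judged_relevant, k=10)
```

With eight relevant items among the top ten, `score` corresponds to a precision@10 of 80%.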

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Latent Structure in Collaboration: the Case of Reddit r/place

Author: Aberer Karl
Catasta Michele
Rappaz Jérémie
West Robert
Publication date: 16/04/2018

Many Web platforms rely on user collaboration to generate high-quality content: wikis, Q&A communities, etc. Understanding and modeling the different collaborative behaviors is therefore critical. However, collaboration patterns are difficult to capture when the relationships between users are not directly observable, since they need to be inferred from the user actions. In this work, we propose a solution to this problem by adopting a systemic view of collaboration. Rather than modeling the users as independent actors in the system, we capture their coordinated actions with embedding methods which can, in turn, identify shared objectives and predict future user actions. To validate our approach, we perform a study on a dataset comprising more than 16M user actions, recorded on the online collaborative sandbox Reddit r/place. Participants had access to a drawing canvas where they could change the color of one pixel at every fixed time interval. Users were neither grouped into teams nor given any specific goals, yet they organized themselves into a cohesive social fabric and collaborated on the creation of a multitude of artworks. Our contribution in this paper is multifold: i) we perform an in-depth analysis of the Reddit r/place collaborative sandbox, extracting insights about its evolution over time; ii) we propose a predictive method that captures the latent structure of the emergent collaborative efforts; and iii) we show that our method provides an interpretable representation of the social structure.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Entity Disambiguation in Tweets leveraging User Social Profiles

Author: Aberer Karl
Catasta Michele
Demartini Gianluca
Yerva Surender Reddy
Publication date: 30/09/2013

Infoscience - École polytechnique fédérale de Lausanne

Crossref

deepschema.org: An Ontology for Typing Entities in the Web of Data

Author: Aberer Karl
Catasta Michele
Gupta Amit
Smeros Panayiotis
Publication date: 10/05/2017

Discovering the appropriate type of an entity in the Web of Data is still considered an open challenge, given the complexity of the many tasks it entails. Among them, the most notable is the definition of a generic and cross-domain ontology. While the ontologies proposed in the past function mostly as schemata for knowledge bases of different sizes, an ontology for entity typing requires a rich, accurate and easily traversable type hierarchy. Likewise, it is desirable that the hierarchy contains thousands of nodes and multiple levels, contrary to what a manually curated ontology can offer. Such a level of detail is required to describe all the possible environments in which an entity exists. Furthermore, the generation of the ontology must follow an automated fashion, combining the most widely used data sources and following the speed of the Web. In this paper we propose deepschema.org, the first ontology that combines two well-known ontological resources, Wikidata and schema.org, to obtain a highly accurate, generic type ontology which is at the same time a first-class citizen in the Web of Data. We describe the automated procedure we used for extracting a class hierarchy from Wikidata and analyze the main characteristics of this hierarchy. We also provide a novel technique for integrating the extracted hierarchy with schema.org, which exploits external dictionary corpora and is based on word embeddings. Finally, we present a crowdsourcing evaluation which showcases the three main aspects of our ontology, namely the accuracy, the traversability and the genericity. The outcome of this paper is published at the portal: http://deepschema.github.io

Infoscience - École polytechnique fédérale de Lausanne

Bartering Books to Beers: a Recommender System for Exchange Platforms

Author: Catasta Michele
McAuley Julian
Rappaz Jérémie
Vladarean Maria-Luiza
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 26/03/2018

Bartering is a timeless practice that is becoming increasingly popular on the Web. Recommending trades for an online bartering platform shares many similarities with traditional approaches to recommendation, in particular the need to model the preferences of users and the properties of the items they consume. However, there are several aspects that make bartering problems interesting and challenging, specifically the fact that users are both suppliers and consumers, and that the trading environment is highly dynamic. Thus, a successful model of bartering requires us to understand not just users’ preferences, but also the social dynamics of who trades with whom, and the temporal dynamics of when trades occur. We propose new models for bartering-based recommendation, for which we introduce three novel datasets from online bartering platforms. Surprisingly, we find that existing methods (based on matching algorithms) perform poorly on real-world platforms, as they rely on idealized assumptions that are not supported by real bartering data. We develop approaches based on Matrix Factorization in order to model the reciprocal interest between users and each other’s items. We also find that the social ties between members have a strong influence, as does the time at which they trade; we therefore extend our model to be socially and temporally aware. We evaluate our approach on trades covering books, video games, and beers, where we obtain promising empirical performance compared to existing techniques.
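
The idea of modeling reciprocal interest with latent factors, as described in the abstract above, can be illustrated with a small sketch. The abstract does not specify the scoring function, so everything below is an assumption made for illustration: the latent vectors are hand-picked rather than learned, and combining the two one-way preferences as a product of sigmoids is one simple choice, not necessarily the authors' model.

```python
import math


def dot(a, b):
    """Inner product of two equal-length latent vectors."""
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    """Squash a raw preference score into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def reciprocal_score(user_u, item_from_v, user_v, item_from_u):
    """Score a swap in which u receives item_from_v and v receives item_from_u.

    Assumption: the swap is attractive only if BOTH sides want what they
    would receive, so the two one-way preferences are multiplied.
    """
    u_wants = sigmoid(dot(user_u, item_from_v))
    v_wants = sigmoid(dot(user_v, item_from_u))
    return u_wants * v_wants


# Hypothetical two-dimensional latent factors (hand-picked, not learned).
alice = [1.0, 0.0]
bob = [0.0, 1.0]
alices_book = [0.0, 2.0]   # matches Bob's tastes
alices_game = [0.0, -2.0]  # does not match Bob's tastes
bobs_beer = [2.0, 0.0]     # matches Alice's tastes

mutual = reciprocal_score(alice, bobs_beer, bob, alices_book)
one_sided = reciprocal_score(alice, bobs_beer, bob, alices_game)
```

In this toy setup the swap that both parties want scores higher than the one only Alice wants, which is the behavior a reciprocal recommender is meant to capture.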

Infoscience - École polytechnique fédérale de Lausanne

Next Place Prediction using Mobile Data

Author: Aberer Karl
Catasta Michele
McDowell Lucas Kelsey
Tran Le Hung
Publication date: 06/11/2012

Infoscience - École polytechnique fédérale de Lausanne

RoutineSense: A Mobile Sensing Framework for the Reconstruction of User Routines

Author: Aberer Karl
Catasta Michele
Ranvier Jean-Eudes
Vasirani Matteo
Publication date: 04/06/2015

Modern smartphones are powerful platforms that have become part of everyday life for most people. Thanks to their sensing and computing capabilities, smartphones can unobtrusively identify simple user states (e.g., location, performed activity, etc.), enabling a plethora of applications that provide insights into the lifestyle of the users. In this paper, we introduce routineSense: a system for the automatic reconstruction of complex daily routines from simple user states, implemented as an incremental processing framework. This framework combines opportunistic sensing and user feedback to discover frequent and exceptional routines that can be used to segment and aggregate multiple user activities in a timeline. We use a comprehensive dataset containing rich geographic information to assess the feasibility and performance of routineSense, showing a near threefold improvement over the current state of the art.

Infoscience - École polytechnique fédérale de Lausanne

Directory of Open Access Journals

MemorySense: Reconstructing and Ranking User Memories on Mobile Devices

Author: Aberer Karl
Catasta Michele
Radu Horia
Ranvier Jean-Eudes Marie
Vasirani Matteo
Yan Zhixian
Publication venue: New York, IEEE
Publication date: 21/01/2014

The richness of user-centric information gathered by modern devices can be used to keep track of memorable events, therefore acting as a prosthesis of the prone-to-forget human memory. We propose to combine virtual and physical sensors from mobile devices to infer digital memories of user activities in a semi-supervised fashion. In MemorySense, sensor data is processed by a space- and energy-efficient algorithm to recognize basic activities. We then use semantic reasoning to aggregate these activities into the digital equivalent of a human episodic memory.

Infoscience - École polytechnique fédérale de Lausanne

Crossref